Computational Study of Stylistics: A Clustering-Based Interestingness Measure for Extracting Relevant Syntactic Patterns
Authors
Abstract
In this contribution, we present a computational stylistic study of classic French literary texts based on a data-driven approach in which interesting linguistic patterns are discovered without any prior knowledge. We propose an objective interestingness measure to extract meaningful stylistic syntactic patterns from a given author’s work. Our hypothesis is that the most characteristic linguistic patterns should significantly reflect the author’s stylistic choices, in that the positions of their occurrences are controlled by the author’s purpose, while irrelevant linguistic patterns are distributed randomly in the text. Since it does not rely on counts of occurrences of the syntactic patterns in texts, this measure can work reasonably well with both large and small text samples. The results show that the measure is effective at extracting interesting syntactic patterns from a single text, which seems particularly promising for the analysis of texts that, because of their characteristics or for historical reasons, cannot support a comparative study.
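The abstract does not define the measure itself, so the following is only an illustrative sketch of the underlying hypothesis, not the authors' method. It assumes a pattern's occurrences are given as token positions and compares how unevenly they are spaced against random placements of the same number of occurrences. The function names `gap_cv` and `clustering_score`, and the choice of the coefficient of variation of gaps as the dispersion statistic, are assumptions made for this example.

```python
import random
from statistics import mean, pstdev


def gap_cv(positions):
    """Coefficient of variation of the gaps between consecutive occurrences.

    Evenly spaced occurrences give 0; tightly grouped occurrences separated
    by long stretches give a large value.
    """
    pts = sorted(positions)
    gaps = [b - a for a, b in zip(pts, pts[1:])]
    if len(gaps) < 2 or mean(gaps) == 0:
        return 0.0
    return pstdev(gaps) / mean(gaps)


def clustering_score(positions, text_length, trials=1000, seed=0):
    """Fraction of random placements that are less clustered than observed.

    Hypothetical score for illustration: each trial draws the same number of
    distinct positions uniformly from the text and compares its gap_cv with
    the observed one. Values near 1 suggest the occurrences are grouped more
    than chance would predict.
    """
    rng = random.Random(seed)
    observed = gap_cv(positions)
    k = len(positions)
    below = 0
    for _ in range(trials):
        sample = rng.sample(range(text_length), k)
        if gap_cv(sample) < observed:
            below += 1
    return below / trials


# Occurrences grouped in three passages of a 1000-token text.
clustered = [10, 11, 12, 13, 500, 501, 502, 503, 900, 901, 902, 903]
# Occurrences spread at a fixed interval over the same text.
regular = list(range(0, 1000, 80))

clustered_score = clustering_score(clustered, 1000, trials=200)
regular_score = clustering_score(regular, 1000, trials=200)
```

Under these assumptions the grouped pattern should score close to 1 and the regularly spaced one scores 0. The number of occurrences is used only to build the random baseline, so the score reflects where a pattern occurs rather than how often.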
Similar resources
A Peculiarity-based Exploration of Syntactical Patterns: a Computational Study of Stylistics
In this contribution, we present a computational stylistic study and comparison of classic French literary texts based on a data-driven approach where discovering interesting linguistic patterns is done without any prior knowledge. We propose an objective measure capable of capturing and extracting meaningful stylistic syntactic patterns from a given author’s work. Our hypothesis is based on the...
Combining Clustering techniques and FCA to characterize Interestingness Measures
Formal Concept Analysis (FCA) is a data analysis method that enables the discovery of hidden knowledge in data. One kind of hidden knowledge extracted from data is association rules. Different Interestingness Measures (IMs) have been reported in the literature to extract only relevant association rules. Given a dataset, the choice of a good interestingness measure remains a challenging task for a ...
Interestingness Measures for Rare Association Rules and Periodic-Frequent Patterns
Data mining is the process of discovering significant and potentially useful knowledge in the form of patterns from the data. As a result, the notion of interestingness is very important for extracting useful knowledge patterns. Numerous interestingness measures have been discussed in the literature to assess the interestingness of a knowledge pattern. In this thesis, we focus on selecting a ri...
Assessment of the Performance of Clustering Algorithms in the Extraction of Similar Trajectories
In recent years, the tremendous and increasing growth of spatial trajectory data and the necessity of processing and extraction of useful information and meaningful patterns have led to the fact that many researchers have been attracted to the field of spatio-temporal trajectory clustering. The process and analysis of these trajectories have resulted in the extraction of useful information whic...
The Impact of Different Frequency Patterns on the Syntactic Production of a 6-year-old EFL Home Learner: A Case Study
This longitudinal study investigated the impact of different Frequency Patterns (FP) on the syntactic production of a six-year-old EFL learner in a home context. Target syntactic constructions were presented using games and plays and were traced for their occurrence patterns in input and output. Following each instruction period, the constructions were measured through immediate and delayed ora...
Journal title:
- Int. J. Comput. Linguistics Appl.
Volume 6, Issue
Pages -
Publication date: 2015